(57) Abstract: A distributed network management system (10) and method of operation. The system (10) includes at least one hub
server (12) and at least one remote server (16), where the hub server (12) and the remote server (16) communicate with each other.
The remote server (16) additionally communicates with and monitors one or more network devices (20). In the event that the remote
server (16) becomes inoperational, the hub server (12) assumes monitoring of the network device (20). For redundancy, primary (12)
and secondary (14) hub servers can be provided, wherein the primary (12) and secondary (14) hub servers communicate with each
other and are capable of communicating with the remote server (16). For further redundancy, primary (16) and secondary (18) remote
servers can be provided, wherein the primary (16) and secondary (18) remote servers communicate with each other but independently
monitor the network devices (20). In the peered remote configuration, the hub server (12) is capable of communicating with either
of the remote servers (16, 18). Where both the hub servers (12, 14) and the remote servers (16, 18) are peered, each hub server (12,
14) is capable of communicating with each remote server (16, 18).






WO 02/03211 PCT/US00/23728

DISTRIBUTED NETWORK MANAGEMENT SYSTEM AND METHOD 






BACKGROUND OF THE INVENTION
1. Field of the Invention
This invention pertains generally to network communications, and more particularly
to monitoring and managing network performance.
2. Description of the Background Art
In the operation of interconnected networks, it is often desirable to have a mechanism
for monitoring the state of equipment and devices in the network. Traditionally, this has been
accomplished using a centrally-based network management system, with a plurality of
individual network management systems feeding up to the central network management
system in a conventional tree hierarchy. Equipment and devices would similarly feed up to
the individual network management systems in a conventional tree hierarchy. Unfortunately,
such an architecture for a network management system does not scale well and does not
provide for propagation of state and configuration information among a set of cooperating
systems.

BRIEF SUMMARY OF THE INVENTION
The present invention is a scalable distributed network management system with the
potential for full redundancy at hub and remote levels. The remotes monitor state changes of
network devices, and those state changes propagate bidirectionally between hubs and
remotes. Furthermore, configuration changes for designating the monitoring parameters of
the remotes propagate bidirectionally between remotes and hubs.

By way of example, and not of limitation, the system includes at least one hub server
and at least one remote server, where the hub server and the remote server communicate with
each other. The remote server additionally communicates with and monitors one or more
network devices. In the event that the remote server becomes inoperational, the hub server
assumes monitoring of the network device(s).

According to another aspect of the invention, for redundancy, primary and secondary
hub servers can be provided, wherein the primary and secondary hub servers communicate
with each other. In this peered hub configuration, if the primary hub server becomes
inoperational and the secondary hub server is operational, the secondary hub server
communicates with the remote server. Additionally, in the peered hub configuration, if both
the primary hub server and the remote server are inoperational, the secondary hub server
assumes monitoring of the network devices.

According to another aspect of the invention, for redundancy, primary and secondary
remote servers can be provided, wherein the primary and secondary remote servers
communicate with each other but independently monitor the network devices. In the peered
remote configuration, if the primary remote server becomes inoperational, the primary hub
communicates with the secondary remote.

According to a still further aspect of the invention, if the remotes and the hubs are
peered and the primary hub is inoperational, the secondary hub communicates with the
primary remote, thereby temporarily assuming the duties of the primary hub. Also in the
peered hub and peered remote configuration, if both the primary hub and primary remote are
inoperational, the secondary hub communicates with the secondary remote. If both remotes
are inoperational, then all active hubs assume monitoring of the network devices.

To facilitate monitoring of network devices, the invention derives state information
from network devices using what is referred to herein as LTP. In
LTP, a plurality of pings is sent from an ICMP server to an interface address on a network
device during a polling interval. The number of pings returned from said network device is
recorded and converted to a percentage based on the ratio of the number of pings received to
the number of pings sent. Next, an SNMP query is sent to the network device and the
operational status of the network device, such as "up", "down" or "unknown", is determined
from the SNMP query. Using the percentage of pings returned and the SNMP status, a status
percentage for the polling period is generated by multiplying the percentage of pings returned by
a constant associated with the operational status, where the constant has a first value if the
operational status is "up", a second value if the operational status is "down", and a third value
if the operational status is "unknown". Next, a weighted average of the status percentages for
the current and previous four polling periods is computed. Then, the state of the network
device is determined from the weighted average.

An object of the invention is to provide a distributed network management system
where configuration information propagates bidirectionally through the system.

Another object of the invention is to provide a distributed network management
system where configuration information can be entered at one location and propagate through
the system.

Another object of the invention is to provide a distributed network management
system which can be accessed through a web server.

Another object of the invention is to provide a distributed network management
system where state changes propagate bidirectionally through the system.

Another object of the invention is to provide a peered distributed network
management system with automatic failover and resynchronization.

Another object of the invention is to provide a distributed network management
system which consolidates multiple status notifications into a single notification based on
an interface hierarchy.

Another object of the invention is to provide a distributed network management
system with a plug-in architecture of service, notification and utility modules.

Another object of the invention is to provide a distributed network management
system that can serve as an information transport.

Further objects and advantages of the invention will be brought out in the following
portions of the specification, wherein the detailed description is for the purpose of fully
disclosing preferred embodiments of the invention without placing limitations thereon.

BRIEF DESCRIPTION OF THE DRAWINGS
The invention will be more fully understood by reference to the following drawings
which are for illustrative purposes only:

FIG. 1 is a schematic diagram of the high level architecture of an embodiment of a
distributed network management system according to the invention depicting the primary hub
and the primary remote as being operational, and the primary hub as communicating with the
primary remote.

FIG. 2 is a schematic diagram of the distributed network management system of FIG.
1 depicting the primary hub as being operational, the primary remote as being inoperational,
the secondary remote as being operational, and the primary hub communicating with the
secondary remote.

FIG. 3 is a schematic diagram of the distributed network management system of FIG.
1 depicting the primary hub as being inoperational, the secondary hub as being operational,
the primary remote as being operational, and the secondary hub communicating with the
primary remote.

FIG. 4 is a schematic diagram of the distributed network management system of FIG. 1
depicting the primary hub as being inoperational, the secondary hub as being operational, the
primary remote as being inoperational, the secondary remote as being operational, and the
secondary hub communicating with the secondary remote.

FIG. 5 is a schematic diagram of the distributed network management system of FIG.
1 depicting the primary and secondary remotes as being inoperational, and the primary and
secondary hubs communicating with the network devices.

FIG. 6 is a schematic diagram of an implementation of a distributed network 
management system according to the invention. 

FIG. 7 is a schematic diagram showing an alternative embodiment of the distributed
network management system implementation of FIG. 6 wherein hubs are regionalized.

FIG. 8 is a functional block diagram of the internal architecture of a remote according 
to the present invention. 

FIG. 9 is a functional block diagram of an alternative embodiment of the remote 
architecture of FIG. 8. 

FIG. 10 is a functional block diagram of the dNMS kernel portion of a remote
according to the present invention.

FIG. 11 is a schematic diagram of an integration server in the dNMS kernel of
FIG. 10.

FIG. 12 is a schematic diagram of a monolithic server in the dNMS kernel of
FIG. 10.

FIG. 13 is a schematic diagram showing data flow between the integration server of
FIG. 11 and the monolithic server of FIG. 12.

FIG. 14 is a schematic diagram depicting traffic flow between hubs and remotes
through queuing according to the invention.

DETAILED DESCRIPTION OF THE INVENTION

Referring more specifically to the drawings, for illustrative purposes the present
invention is embodied in the components, system and methods generally shown in FIG. 1
through FIG. 14. It will be appreciated that the invention may vary as to configuration and
details without departing from the basic concepts as disclosed herein.

FIG. 1 is a schematic diagram of the high level architecture 10 of an embodiment of a
distributed network management system according to the present invention. In the
embodiment shown, the system comprises a primary hub 12 and a secondary hub 14, both of
which can communicate with a primary remote 16 and a secondary remote 18. The remotes
in turn communicate with a specific set of devices 20 on nodes 22 of the network 24, such as
routers, to monitor network status. The network may be all or a portion of the Internet or
other wide area network. The set of network devices is selected to provide an overall
representation of the network being monitored.

Each hub is in active communication with the other hub through a full-time
communications link 26 for redundancy, so that data received from one hub is continuously
propagated to the other. Similarly, each remote is in active communication with the other
remote through a full-time communications link 28 for redundancy and for continuously
propagating data to the other remote. In addition, each remote is in constant communication
with each network device. However, each remote preferably monitors the network devices
independent of the other remote. As a result, the data acquired by a remote may disagree with
the data acquired by the other remote, even though both remotes are monitoring the same
network devices. Because the remotes operate independently of each other, the monitoring
times could be different and a particular remote may observe a network condition that was not
observed by the other remote. For example, one remote may monitor conditions thirty
seconds into each minute, while another remote may monitor conditions forty-five seconds
into each minute.

Primary hub 12 is in full-time communication with primary remote 16 through
communication link 30 so that changes detected by primary remote 16 are continuously
propagated to primary hub 12 as well as to secondary hub 14 through primary hub 12. In
addition, configuration data such as which network devices to monitor can be propagated to
primary remote 16 and to secondary remote 18 through primary remote 16. Note, however,
that there is also a normally inactive communication link 32 between secondary hub 14 and
secondary remote 18, a normally inactive communications link 34 between secondary hub 14
and primary remote 16, and a normally inactive communications link 36 between primary
hub 12 and secondary remote 18. These communications links are not necessarily direct
physical links, however. In the preferred embodiment of the invention, each remote and
network device has an address, such as an Internet Protocol (IP) address. This allows the
remote or network device to be accessed over a network such as, for example, the Internet. In
addition, each hub can communicate directly with a network device as well.

With the architecture described above, the preferred communications hierarchy is as
follows:

1. If the primary hub and the primary remote are operational, the primary hub
communicates with the primary remote as shown in FIG. 1.

2. If the primary hub is operational, the primary remote is inoperational, and the
secondary remote is operational, the primary hub communicates with the secondary remote as
shown in FIG. 2.

3. If the primary hub is inoperational, the secondary hub is operational, and the
primary remote is operational, the secondary hub communicates with the primary remote as
shown in FIG. 3.

4. If the primary hub is inoperational, the secondary hub is operational, the
primary remote is inoperational, and the secondary remote is operational, the secondary hub
communicates with the secondary remote as shown in FIG. 4.

5. If both the primary and secondary remotes are inoperational, all active hubs
assume monitoring of the remote network as shown in FIG. 5.
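
The five rules of the communications hierarchy can be sketched as a small selection function. This is an illustrative sketch only, not code from the specification: the function name `select_links`, the string labels, and the empty result when no hub is operational (a case the text does not address) are assumptions.

```python
from typing import List, Tuple


def select_links(primary_hub_up: bool, secondary_hub_up: bool,
                 primary_remote_up: bool, secondary_remote_up: bool
                 ) -> List[Tuple[str, str]]:
    """Return the active (hub, peer) links implied by rules 1 to 5."""
    # Hubs in order of preference: primary first, then secondary.
    active_hubs = [name for name, up in (("primary_hub", primary_hub_up),
                                         ("secondary_hub", secondary_hub_up)) if up]
    if not active_hubs:
        return []  # assumption: the text does not cover "no hub operational"

    acting_hub = active_hubs[0]
    if primary_remote_up:
        return [(acting_hub, "primary_remote")]      # rules 1 and 3
    if secondary_remote_up:
        return [(acting_hub, "secondary_remote")]    # rules 2 and 4
    # Rule 5: both remotes down, so every active hub monitors the devices itself.
    return [(hub, "network_devices") for hub in active_hubs]
```

For instance, with the primary hub down and both remotes up, the function returns a single link from the secondary hub to the primary remote, which corresponds to FIG. 3.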

Referring now to FIG. 6, an example of a possible geographical configuration of a
distributed network management system according to the invention is shown. In FIG. 6, a
first set of hubs 38 is shown located in the vicinity of Seattle and a second set of hubs 40 is
shown located in the vicinity of New York City. Also shown are several sets of remotes 42,
44, 46, 48, 50, 52, 54, 56, and 58, each of which monitors a portion of the overall network.
Note that hubs 38 monitor remotes 42, 44, 46, and 48, while hubs 40 monitor remotes 50, 52,
54, 56, and 58. A change of state monitored by, for example, remotes 50 will propagate to
hubs 40 in New York City, and from hubs 40 to sister hubs 38 in Seattle so that both sets of
hubs have the same state information.

While the foregoing configuration is scalable, the addition of a larger number of
remotes or hubs can become more complex than necessary. In that event, an additional
monitoring layer can be added above the hubs. In this way, not only are remotes assigned to
regions of the network, but hubs are assigned to regions of the network as well. For example,
referring to FIG. 7, three regions 60, 62 and 64 are shown. Each region would include a
primary and secondary hub that would be responsible for that region. For example, primary
hub 66 and secondary hub 68 would be responsible for region 60, primary hub 70 and
secondary hub 72 would be responsible for region 62, and primary hub 74 and secondary hub
76 would be responsible for region 64. In turn the hubs in a particular region would be
responsible for several sets of primary and secondary remotes in that region, such as set 78,
78', 78" ... in region 60, set 80, 80', 80" in region 62, and set 82, 82' and 82" in region 64,
and each set of remotes would be responsible for a portion of the network devices therein.
The data collected by the primary hubs in each region would be propagated to a primary hub
aggregator 84, which in turn would propagate the data to a secondary hub aggregator 86 for
redundancy. In this way, a multi-level distributed system architecture can be achieved.

Referring now to FIG. 1 and FIG. 8, an embodiment of the internal architecture of
primary 16 and secondary 18 remotes is shown. Each remote includes a dNMS kernel 88 that,
in addition to other functions that will be described, acquires data from the network 24. Also
shown is a scheduler 90, which is a plug-in service that notifies administrative personnel that
a problem exists on the network being monitored.

Each remote is accessible through a client terminal 92 running a browser-based
application interface. Note that data propagates from the network to each dNMS kernel
through a data path 94, and that configuration changes received from a hub (not shown)
propagate to each dNMS kernel through a configuration path 96.

Optionally, the remotes can include a collector 98, which is also a plug-in service, to
which data from the network propagates and is stored in data files 100 for billing or other
purposes. Also shown is a module 102 for mining the stored data and a module 104 for
collating the mined data into a central database 106 accessible by a client terminal 108 for
billing. The details of those components are not described herein as they do not form a part
of the invention and are shown solely to indicate additional ways in which the data acquired
by a remote can be used. In the event that such additional uses of the data are made,
processing overhead of the remotes may increase. In that event, it is preferred to reduce the
load on the primary remote by moving the auxiliary data collection functions into a separate
remote server 110 as shown in FIG. 9. The primary remote 16 is then dedicated to
monitoring network conditions, while server 110 is dedicated to the auxiliary data collection
functions. Secondary remote 18 can be configured as before, or unloaded in the same way.

Note that primary 12 and secondary 14 hubs in FIG. 1 would be configured in the
same manner as the remotes. Note also that configuration information, as well as state
information, propagates bidirectionally between hubs and remotes and between peers (e.g.,
hub to hub or remote to remote).

As can be seen, therefore, a critical element of a hub and a remote is the dNMS
kernel 88. Referring now to FIG. 10, which shows primary remote 16 as an example, the
high level architecture of dNMS kernel 88 comprises an integration server 112 and a
monolithic server 114. Integration server 112 communicates with client terminal 92 and
monolithic server 114 communicates with the network devices connected to network 24.

In the case of a remote, state information relating to the network devices collected by
monolithic server 114 is propagated to integration server 112 and then propagated to the
integration server in primary hub 12, for example. Furthermore, in the case of a remote,
configuration information such as the IP addresses of the network devices to be monitored is
entered into integration server 112 from client terminal 92, from which it propagates down to
monolithic server 114 as well as propagates up to the integration server in primary hub 12.
Alternatively, configuration information can be entered into a hub, in which case the
configuration information propagates down to the integration server and the monolithic server in
the remotes. While configuration information is entered into a dNMS kernel by a client
terminal, state information for the network devices is acquired. In the preferred embodiment
of the invention, state information is derived using what will be referred to herein as LTP,
which is an acronym developed by the inventors herein. LTP provides for simple real time
monitoring of network devices and their interfaces using ICMP, SNMP or a combination
thereof, and employs a sliding window to compensate for minor interruptions in Internet links
or IP traffic.

In LTP according to the present invention, a polling interval is defined during which
each ICMP server sends out a plurality of ICMP echo requests, or pings. While the polling
interval and number of pings can vary, in the preferred embodiment ten pings are sent every
sixty seconds, with each ping being separated by a one-second interval. The number of pings
that are returned is converted to a percentage for that polling interval.

In addition, for that same polling interval, if the node is SNMP-enabled (which may
not be the case for servers and other non-router equipment), an SNMP query is sent to the
node on which the interface resides. The "operational status" of the interface is queried as to
three possible states: "up", "down", and "unknown". An "unknown" operational status means
that the SNMP request was never returned and, therefore, the system does not know the
status.

Using the percentage of pings returned and the SNMP status, a single number is
generated for the polling period. This number is generated by multiplying the percentage of
pings returned by a constant that is assigned depending on the result of the SNMP query;
namely, a value of one if the query returned "up", a value of zero if the query returned
"down", and a value of 0.4 if the query returned "unknown". In essence, if the SNMP query
returned "up", we simply use the percentage of returned ICMP packets. If the query returned
"down", we discard the ICMP information and take the time period as being zero percent. If
the query returned "unknown", we assume that there is a routing problem and multiply the
percentage of ICMP packets by an arbitrary value of four tenths (0.4). For example, if ten out of
ten pings are returned during a polling interval, but we were unable to obtain SNMP
information for that interface during that time period, the ratio for that time period would be
forty percent (40%). Table 1 shows examples of various network conditions, given different
SNMP and ICMP values, including the total ratio computed for the time period.
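
The per-period calculation described above can be written in a few lines. The following is a minimal sketch, not the patented implementation: the names `SNMP_FACTOR` and `status_percentage` are hypothetical, and the sketch does not address nodes that are not SNMP-enabled, since the text does not say which constant applies to them.

```python
# Multipliers taken from the text: 1 for "up", 0 for "down", 0.4 for "unknown".
SNMP_FACTOR = {"up": 1.0, "down": 0.0, "unknown": 0.4}


def status_percentage(pings_sent: int, pings_returned: int, snmp_status: str) -> float:
    """Status percentage for one polling period (0 to 100)."""
    if pings_sent <= 0:
        raise ValueError("pings_sent must be positive")
    if not 0 <= pings_returned <= pings_sent:
        raise ValueError("pings_returned must be between 0 and pings_sent")
    ping_pct = 100.0 * pings_returned / pings_sent
    # Scale the ICMP result by the constant tied to the SNMP operational status.
    return ping_pct * SNMP_FACTOR[snmp_status]
```

With ten of ten pings returned and an "unknown" SNMP status, the function yields forty percent, matching the worked example in the paragraph above.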

Once that percentage is computed in this manner, the next step is to compute a
weighted average of the percentages for the current and previous four time periods. This is
preferably carried out with a five element table with a sliding window. The percentage for
the current time is inserted in the rightmost (e.g., current period) slot. If the current period
slot is not empty, all values in the table are shifted to the left by one slot (i.e., the oldest data
is dropped). Therefore, each position in the table represents a different time period's ratio.
The leftmost slot contains data that is four polling intervals old and, as the table is traversed
to the right, the data is more recent.
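
One way to picture the five element table is as a bounded double-ended queue. The sketch below assumes Python's `collections.deque` as the container; the helper names `new_window` and `record_period` are hypothetical and do not appear in the specification.

```python
from collections import deque
from typing import Deque


def new_window(size: int = 5) -> Deque[float]:
    """Create an empty sliding window holding at most `size` period percentages."""
    return deque(maxlen=size)


def record_period(window: Deque[float], percentage: float) -> None:
    """Insert the current period at the right; a full window drops its leftmost value."""
    window.append(percentage)


window = new_window()
for pct in (100.0, 90.0, 80.0, 70.0, 60.0, 50.0):
    record_period(window, pct)
# After six periods the oldest value (100.0) has been shifted out of the table.
```

Because `deque(maxlen=5)` discards from the opposite end on append, the shift-left step of the table needs no explicit code.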

Each position in the table is also assigned a weight which affects the extent to which
that position in the table will influence the final percentage; that is, the state of the interface.
Higher weights are assigned to the more recent polling intervals, as they are more indicative
of the current state. Note, however, that the weights should not be too high; otherwise, the
result will be over-notification of problems with the interface. In other words, if the weights
are set too high, the normal intermittency in the Internet will result in unnecessary
notification. By keeping the weights low, some flapping of the interface is allowed without
over-notification. Therefore, the weights can vary and are typically set using empirical data.

Table 2 shows an example of a completely filled in sliding window for an interface
that, while having an "up" operational state as far as the router is concerned, is dropping a
considerable number of ICMP packets. Table 3 shows the relationship between the
percentage for the polling period and the "total ratio" once the weights are applied. To arrive
at the forty-five percent (45%) total ratio, we take all of the positions in the table into
account. The position percentage is multiplied by the weight for all positions to arrive at the
resulting percentage for all positions. The resulting percentages are then added and divided
by the sum of the weights. Given this total percentage, the final state of the interface is
computed. Referring to Table 4, if the percentage is greater than sixty percent (60%), the
interface is considered "up". If the percentage is between forty percent (40%) and sixty
percent (60%), the state is either intermittent or unknown. However, it is unknown if and
only if the last SNMP poll came back as "unknown"; otherwise, it is intermittent. If the ratio
is less than forty percent (40%), the interface is "down".
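
The weighting and classification steps can be sketched as follows. The weights in `EXAMPLE_WEIGHTS` are purely hypothetical (the text says only that recent periods weigh more and that weights are set empirically), the function names are illustrative, and treating exactly 40% and exactly 60% as part of the middle band is an assumption about boundary handling.

```python
from typing import Sequence

# Hypothetical weights, oldest period first; the specification gives no values here.
EXAMPLE_WEIGHTS = (1, 2, 3, 4, 5)


def weighted_ratio(percentages: Sequence[float],
                   weights: Sequence[float] = EXAMPLE_WEIGHTS) -> float:
    """Sum of percentage * weight over all positions, divided by the sum of the weights."""
    if len(percentages) != len(weights):
        raise ValueError("one weight is required per table position")
    return sum(p * w for p, w in zip(percentages, weights)) / sum(weights)


def interface_state(total_ratio: float, last_snmp_status: str) -> str:
    """Map the total ratio to a state using the 40% and 60% thresholds."""
    if total_ratio > 60.0:
        return "up"
    if total_ratio < 40.0:
        return "down"
    # Middle band: "unknown" only if the last SNMP poll came back "unknown".
    return "unknown" if last_snmp_status == "unknown" else "intermittent"
```

Under these example weights, a window of `[0, 0, 0, 0, 100]` gives 500 / 15, about 33%, so a single good period after four bad ones still classifies as "down".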

It can be appreciated at this point that a hub and remote each comprise software
executing on hardware. The hardware comprises one or more conventional computers and
associated peripherals and communications interfaces. The dNMS kernel is a software engine
executable on a computer that is integral to a hub or a remote. Preferably, the engine is never
modified; instead, for flexibility and scalability, the invention employs "plug-ins" to
implement specific functions. A "plug-in" as the term is used herein is a software module
that carries a unique file name. Additionally, the only information that need be changed in
the dNMS kernel is the configuration information that controls the functioning of a plug-in
service, such as LTP described above. The dNMS kernel sends the configuration
information, such as device addresses and how often a plug-in should perform a specified
function on one or more devices, to the plug-ins and the monolithic server, and the monolithic
server monitors the network devices based on the configuration information acquired by the
plug-ins.

Monolithic server processing according to the invention can be summarized in terms
of nodes (e.g., routers, servers, or topological containers for the same), interfaces (e.g.,
physical interfaces, IP addresses), services and notifiers. While nodes and interfaces have
states, neither a node nor an interface knows how to determine its own state. Nodes and
interfaces only have states because they are associated with services that have a state.
Therefore, state information is derived from services; namely, an action performed on a node
or interface that returns information. A service has a state by definition and is the only object
that determines state on its own. An example of a service, as described above, is LTP.

In the present invention, a notifier is a plug-in that routes state information to another
service, such as scheduler 90 in FIG. 8. If a service has determined that a change of state has
taken place, a notifier is called. Therefore, a notifier is called when the state of a service is
changed. In contrast, states of interfaces and nodes are determined by their owned services,
but a notifier is not called when the state of an interface or node changes. Note, however, that
generally speaking the state change of a service will cause a change of state for the
corresponding interface or node.
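
The rule that a notifier fires only on a service state change can be illustrated with a callback list. This is a sketch under assumptions: the `Service` class, the `Notifier` callback signature, and the initial "unknown" state are hypothetical and are not taken from the specification.

```python
from typing import Callable, List, Tuple

# Hypothetical notifier signature: (service_name, old_state, new_state).
Notifier = Callable[[str, str, str], None]


class Service:
    """A service owns its state and calls its notifiers only when that state changes."""

    def __init__(self, name: str, notifiers: List[Notifier]) -> None:
        self.name = name
        self.state = "unknown"  # assumed starting state before the first poll
        self._notifiers = list(notifiers)

    def update_state(self, new_state: str) -> None:
        if new_state == self.state:
            return  # no change of state, so no notifier is called
        old_state, self.state = self.state, new_state
        for notify in self._notifiers:
            notify(self.name, old_state, new_state)


events: List[Tuple[str, str, str]] = []
ltp = Service("ltp", [lambda name, old, new: events.append((name, old, new))])
ltp.update_state("up")
ltp.update_state("up")    # repeated state: ignored
ltp.update_state("down")
```

Here `events` ends up with two entries, one per actual transition, which is the behaviour the paragraph attributes to services as opposed to interfaces and nodes.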

Note also that the state of an interface is defined as the worst state of any of its
services, and that the state of a node is defined as the worst state of its interfaces, sub-nodes,
and services. This means that a state change of a node or an interface is dictated by a
downstream state change, which may not represent all objects on that node or interface.
Accordingly, to manage the amount of notifications resulting from state changes on a node or
interface, the present invention employs a "toggle notification flag" associated with nodes and
interfaces. By setting the flag, an object will be ignored in an upstream state determination.
For example, if a node contains multiple interfaces, the state of one or more of the interfaces
can be ignored for purposes of determining the state of the node. Notifiers are not called for
interfaces or nodes which have their "toggle notification flag" set.
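
The "worst state" roll-up with an ignore flag might look like the sketch below. Several points are assumptions rather than statements from the specification: the severity ordering in `SEVERITY`, the choice of "up" for an object with no contributing children, and the attribute name `ignore` standing in for the "toggle notification flag".

```python
from dataclasses import dataclass, field
from typing import Iterable, List

# Assumed ordering from best to worst; the text does not rank the states.
SEVERITY = {"up": 0, "intermittent": 1, "unknown": 2, "down": 3}


def worst(states: Iterable[str]) -> str:
    """Return the most severe state, or "up" if nothing contributes (assumption)."""
    return max(states, key=SEVERITY.__getitem__, default="up")


@dataclass
class Service:
    state: str = "up"


@dataclass
class Interface:
    services: List[Service] = field(default_factory=list)
    ignore: bool = False  # stands in for the "toggle notification flag"

    @property
    def state(self) -> str:
        return worst(s.state for s in self.services)


@dataclass
class Node:
    interfaces: List[Interface] = field(default_factory=list)
    sub_nodes: List["Node"] = field(default_factory=list)
    services: List[Service] = field(default_factory=list)
    ignore: bool = False

    @property
    def state(self) -> str:
        # Flagged interfaces and sub-nodes are skipped in the upstream determination.
        states = [i.state for i in self.interfaces if not i.ignore]
        states += [n.state for n in self.sub_nodes if not n.ignore]
        states += [s.state for s in self.services]
        return worst(states)
```

A node with one "up" interface and one "down" interface reports "down"; setting `ignore` on the failing interface makes the node report "up" again.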

Referring now to FIG. 11 and FIG. 12, the preferred embodiment of the lower level
architecture of dNMS kernel 88 is shown. At the outset, it should be noted that this
architecture is common to all dNMS kernels, whether they reside in a hub or a remote. In
FIG. 11, the architecture of the integration server is shown, while the architecture of the
monolithic server is shown in FIG. 12. Note that the basic architecture is the same; however,
the functions are different.

A primary function of integration server 112 is to manage the configuration
information for the network it is configured to represent, such as network 24. An integration
server includes "placeholders" for each of the plug-in services, with each placeholder having
a unique name that corresponds to the plug-in service that monitors the network. These
placeholders are not operational services, however; they only represent configuration
information that is passed to operational plug-ins located in monolithic server 114. The
integration server manages this configuration information since it is connected to other
integration servers in other dNMS kernels and, as discussed previously, configuration
information propagates bidirectionally through the system. Therefore, the integration servers
manage and route the configurations of all of the monitoring and collection services for the
distributed network management system of the invention.

The monolithic server shares the same architecture as the integration server, as can be
seen in FIG. 12. Here, however, the services are operational and determine the state of
downstream objects on the network. Note that the numbers and types of services are not
limited. One such service is LTP as described above. Other services include, but are not
limited to, monitoring bandwidth thresholding, temperature, power supply status, disk space,
and environmental conditions. The system may optionally include one or more utility
modules, such as an auto discovery module that knows how a router works and can talk to
a router to automatically add interfaces. Essentially, any software module that is not in the
dNMS kernel itself can be "plugged-in" to the dNMS kernel to provide a service.

As indicated previously, each service has a unique identification (e.g., service or file
name). Referring to FIG. 13, these identifiers permit the integration server and monolithic
server to communicate through a conduit 116, which is an internal bus or other
communications link. This allows state information from the monolithic server to be
propagated to the corresponding service placeholder in the integration server for further
propagation to another dNMS kernel. It also allows for configuration information to be
propagated from the integration server to the monolithic server, whether the configuration
information originates from the same or a different dNMS kernel (e.g., from the hub or
remote in which the dNMS kernel resides, or from another hub or remote).

It will be appreciated that assigning a unique identifier to every service also allows for
dNMS kernel to dNMS kernel communication. In addition to every service having a unique
identifier, each identifier has a relative timestamp that denotes the last time that the service
was changed. For example, when a "change" message such as an "add service" message is
transmitted, it would indicate that the change was made one-thousand (1000) seconds ago.
This helps resolve time-based synchronization problems.

Note also that every attribute type for the various objects has a change message type,
such as polling rate, node name, etc. The reason for the time stamping is that, if two changes
for the same attribute of the object are received, the most recent is used. More simply, if a
change is received that is more recent than what is currently recorded, the more recent
information is kept instead. Note that the sender of the change does not care how the
recipient handles the message, only that it was received.
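
The timestamp rule described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the invention: the class name AttributeStore, the method apply_change, and the conversion of a relative age into a time on the local clock are all introduced here for illustration.

```python
import time


class AttributeStore:
    """Keeps the newest value per (service_id, attribute) pair.

    Hypothetical helper: the class and method names are illustrative only.
    """

    def __init__(self, clock=time.time):
        self._clock = clock
        self._values = {}  # (service_id, attribute) -> (changed_at, value)

    def apply_change(self, service_id, attribute, value, age_seconds):
        """Apply a change made `age_seconds` ago; return True if it was kept."""
        # Convert the sender's relative age into a time on the local clock,
        # so the two machines never need synchronized wall clocks.
        changed_at = self._clock() - age_seconds
        key = (service_id, attribute)
        current = self._values.get(key)
        if current is not None and current[0] >= changed_at:
            return False  # the recorded change is at least as recent
        self._values[key] = (changed_at, value)
        return True

    def get(self, service_id, attribute):
        entry = self._values.get((service_id, attribute))
        return None if entry is None else entry[1]
```

With a fixed clock, a change reported as 1000 seconds old is kept over a later-arriving change reported as 2000 seconds old, and is itself replaced by one reported as 10 seconds old.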

Referring to FIG. 14, the invention also includes a mechanism to control traffic
between hubs and remotes. Each time a change message is sent, it is placed into a queue. For
example, primary remote 16 sends a message to primary hub 12 through queue 118, and
messages from primary hub 12 to primary remote 16 are sent through queue 120. The
message is then sent to the appropriate recipient. When the recipient acknowledges receipt,
the message is dropped out of the queue. If the recipient does not have sufficient storage to
accept the message, it will not send an acknowledgement. In that event, the message will stay
in the queue indefinitely, until an acknowledgement is received. For example, a remote could
keep the message in the queue and not take the message until it has room to receive the
message. Note that there are two reasons for a hub or remote to send a change message:
when that hub or remote generates the change message, or when propagating a change
message for another hub or remote. An example would be where a secondary remote
generates a change message. The secondary remote would send it to the primary remote and,
in turn, the primary remote would propagate it up to a hub.
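
The queue-and-acknowledge behavior can be sketched as below. This is a simplified, single-process illustration; the class ChangeQueue and the send callable (assumed to return True only when the recipient acknowledged the message) are hypothetical names, not part of the disclosure.

```python
from collections import deque


class ChangeQueue:
    """Outbound queue that drops a message only after it is acknowledged."""

    def __init__(self, send):
        # `send` is a hypothetical callable: it delivers one message and
        # returns True only if the recipient acknowledged receipt.
        self._send = send
        self._pending = deque()

    def enqueue(self, message):
        self._pending.append(message)

    def flush(self):
        """Try to deliver queued messages in order; stop at the first one
        that is not acknowledged so it stays queued for the next attempt."""
        delivered = 0
        while self._pending:
            if not self._send(self._pending[0]):
                break
            self._pending.popleft()
            delivered += 1
        return delivered

    def __len__(self):
        return len(self._pending)
```

A message that is never acknowledged simply remains at the head of the queue, which matches the "stay in the queue" behavior in the text; retry scheduling is left out of this sketch.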

The use of queues and acknowledgement controls will also keep the hubs from
becoming overloaded when all or a part of the system returns from a system failure. Suppose,
for example, that a secondary hub comes on line after a failure and thinks that it last received
change information from the primary hub thirty (30) seconds ago. Also assume that the
primary hub thinks that it last spoke to the secondary hub twelve-hundred (1200) seconds
ago. In this instance, the primary hub would send a batch change representing a list of all
changes in the past twelve-hundred (1200) seconds to the secondary hub, since that is the
oldest timestamp. This can occur in either direction. The queues exist to accommodate batch
transactions, rather than real-time transactions.
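
The recovery example above amounts to picking the older of the two "last contact" ages and replaying everything newer than that. A minimal sketch, assuming the change history is available as a list of (age in seconds, description) pairs; the function names and that data layout are hypothetical.

```python
def resync_window(local_age_seconds, peer_age_seconds):
    """Return how far back (in seconds) the batch must reach.

    Each side has its own idea of when the peers last spoke; the older
    of the two is used so that no change is missed.
    """
    return max(local_age_seconds, peer_age_seconds)


def changes_to_send(change_log, window_seconds):
    """Select changes made within the window.

    `change_log` is a list of (age_seconds, description) pairs, a
    hypothetical in-memory stand-in for the real change history.
    """
    return [desc for age, desc in change_log if age <= window_seconds]
```

With the figures from the text (1200 seconds on one side, 30 seconds on the other), the window is 1200 seconds.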

Another aspect of the invention involves knowing if a peer is operational; for
example, a primary hub knowing that its corresponding secondary hub is operational and vice
versa. In the present invention, this is not determined simply by testing connectivity. Here,
all systems connected to each other send "keep alive" signals at specified intervals and look
for "keep alive" signals from their peers at specified intervals. For example, every forty (40)
seconds a "keep alive" signal is sent from the primary hub to the secondary hub. If a "keep
alive" signal is not received by the secondary hub within one-hundred and eighty (180)
seconds, the primary hub is considered to be down. Additionally, if a system tries to
communicate with its peer, but cannot, the peer is deemed to be down. Other polling periods
could be used, but the foregoing have been found empirically to provide the best results.
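
The peer-liveness rule can be sketched as a small state holder. The 40-second and 180-second values come from the text; the class PeerMonitor, its method names, and the choice to treat exactly 180 seconds of silence as still "up" are assumptions of this sketch.

```python
KEEPALIVE_INTERVAL = 40   # seconds between "keep alive" signals (from the text)
PEER_TIMEOUT = 180        # seconds of silence before a peer is deemed down


class PeerMonitor:
    """Tracks whether a peer is considered operational (hypothetical helper)."""

    def __init__(self, now):
        self._last_heard = now
        self._send_failed = False

    def on_keepalive(self, now):
        # Any received "keep alive" signal refreshes the peer's status.
        self._last_heard = now
        self._send_failed = False

    def on_send_failure(self):
        # A failed attempt to reach the peer also marks it as down.
        self._send_failed = True

    def is_up(self, now):
        if self._send_failed:
            return False
        return (now - self._last_heard) <= PEER_TIMEOUT
```

Times are passed in explicitly so the logic can be exercised without a real clock; a sender would emit its own signal every KEEPALIVE_INTERVAL seconds.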
Also, with regard to the anatomy of a message, each message includes a unique
identifier, a timestamp, a change type (e.g., node add, node remove, IP address), a message ID,
and information specific to the change type (e.g., node name or IP address). To prevent
looping in the system, each time a system sends a message it puts its host name in the message
and will never send a message to a system whose name is already in the message.
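
The message fields and the loop-prevention rule can be sketched as follows. The field names, the ChangeMessage class, and the forward function are illustrative assumptions; a real system would send a copy of the message to each recipient rather than share one object.

```python
from dataclasses import dataclass, field


@dataclass
class ChangeMessage:
    service_id: str          # unique identifier of the service
    age_seconds: float       # relative timestamp of the change
    change_type: str         # e.g. "node add", "node remove", "IP address"
    message_id: int
    payload: dict            # information specific to the change type
    visited: list = field(default_factory=list)  # host names already in the message


def forward(message, own_host, neighbours):
    """Stamp `own_host` on the message and return the neighbours to send to.

    A neighbour whose name is already in the message is skipped, which
    is what prevents the message from looping.
    """
    if own_host not in message.visited:
        message.visited.append(own_host)
    return [host for host in neighbours if host not in message.visited]
```

For example, a message that started at a secondary remote and passed through the primary remote is forwarded only to the hub, never back to the secondary remote.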

Lastly, it will be appreciated by those skilled in the art that a possible system
configuration might involve monitoring a plurality of devices through one physical cable to
all devices. In the event that the cable becomes inoperational, each of those devices may be
reported as being inoperational. To reduce the need for "redundant" reporting of multiple
devices experiencing an outage when the outage is due to a cable or other common device
being inoperational, we can collate all devices into one and simply report that the common
interface is inoperational.
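
One way to picture the collation step is shown below. This is only a sketch: the topology mapping parent_of, the is_down callable, and the report strings are hypothetical, and the text does not specify how the common device is identified.

```python
def collate_outages(down_devices, parent_of, is_down):
    """Group down devices under a shared parent that is itself down.

    `parent_of` maps a device to the common interface or cable it depends on
    (hypothetical topology data); `is_down` tells whether that parent is down.
    Returns a sorted list of report strings.
    """
    reports = set()
    for device in down_devices:
        parent = parent_of.get(device)
        if parent is not None and is_down(parent):
            # Report the common cause once instead of every dependent device.
            reports.add(f"{parent} inoperational")
        else:
            reports.add(f"{device} inoperational")
    return sorted(reports)
```

Two switches behind the same failed cable produce a single report for the cable, while a device whose parent is healthy is still reported individually.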

Although the description above contains many specificities, these should not be
construed as limiting the scope of the invention but as merely providing illustrations of some
of the presently preferred embodiments of this invention. Thus the scope of this invention
should be determined by the appended claims and their legal equivalents. Therefore, it will
be appreciated that the scope of the present invention fully encompasses other embodiments
which may become obvious to those skilled in the art, and that the scope of the present
invention is accordingly to be limited by nothing other than the appended claims, in which
reference to an element in the singular is not intended to mean "one and only one" unless
explicitly so stated, but rather "one or more." All structural, chemical, and functional
equivalents to the elements of the above-described preferred embodiment that are known to
those of ordinary skill in the art are expressly incorporated herein by reference and are
intended to be encompassed by the present claims. Moreover, it is not necessary for a device
or method to address each and every problem sought to be solved by the present invention,
for it to be encompassed by the present claims. Furthermore, no element, component, or
method step in the present disclosure is intended to be dedicated to the public regardless of
whether the element, component, or method step is explicitly recited in the claims. No claim
element herein is to be construed under the provisions of 35 U.S.C. 112, sixth paragraph,
unless the element is expressly recited using the phrase "means for."
Table 1

Examples for filling out one entry in the LTP sliding window

| Single row from LTP viewer | Description of situation | ICMP percentage received | SNMP status | Resulting percentage for time period |
|---|---|---|---|---|
| -4 min (100%), up | normal up interface, passing traffic (100% ICMP x 1 SNMP = 100%) | 100% | up (1x) | 100% |
| -4 min (0%), down | normal down interface, not passing anything (0% ICMP x 0 SNMP = 0%) | 0% | down (0x) | 0% |
| -4 min (40%), up | major packet loss to interface, but interface is still up (40% ICMP x 1 SNMP = 40%) | 40% | up (1x) | 40% |
| -4 min (36%), snmp-unknown | interface passing most traffic, but problem gathering snmp info (likely an snmp-renumber issue) (90% ICMP x .4 SNMP = 36%) | 90% | unknown (no response) (.4x) | 36% |
| -4 min (0%), down | routing problem causing pings to go through anyway, even though interface is down (or, an snmp-renumber issue) (60% ICMP x 0 SNMP = 0%) | 60% | down (0x) | 0% |
| -4 min (100%), undefined | normal pings on an interface with no SNMP (web server, etc.) (70% ICMP = 70%) | 70% | | 70% |
| -4 min (.), up | snmp-only monitoring of un-numbered interface, no ICMP status at all (1 SNMP = 100%) | | | 100% |
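
The per-period calculation illustrated in Table 1 can be sketched as a small function. The multipliers (1, 0, and .4) are read from the table and should be treated as example values; the function name ltp_entry and the use of None for "not monitored" are assumptions of this sketch.

```python
# Multipliers read from Table 1; treat them as example values.
SNMP_MULTIPLIER = {"up": 1.0, "down": 0.0, "unknown": 0.4}


def ltp_entry(icmp_percent=None, snmp_status=None):
    """Return the percentage recorded for one polling period.

    `icmp_percent` is None when no ICMP polling is done; `snmp_status` is
    None when the interface has no SNMP.
    """
    if snmp_status is None:
        return icmp_percent                       # ICMP-only monitoring
    multiplier = SNMP_MULTIPLIER[snmp_status]
    if icmp_percent is None:
        return 100.0 * multiplier                 # SNMP-only monitoring
    return icmp_percent * multiplier
```

For instance, 90% of pings returned with an "unknown" SNMP status gives roughly 36%, and 60% of pings returned with a "down" SNMP status gives 0%.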
Table 2

Example output for entire window of data

| Time period | Percentage | SNMP state | Weight |
|---|---|---|---|
| -4 min | 33% | up | 2x |
| -3 min | 33% | up | 2x |
| -2 min | 0% | up | 3x |
| -1 min | 100% | up | 3x |
| 0 min | 50% | up | 4x |
Table 3

Total ratio calculation for LTP view in Table 2

| Percentage received for time period | Weight | Resulting percentage |
|---|---|---|
| 33% | 2x | +66% |
| 33% | 2x | +66% |
| 0% | 3x | +0% |
| 100% | 3x | +300% |
| 50% | 4x | +200% |
| | | 632% / 14 = 45% |
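
The total ratio is a weighted average: each period's percentage is multiplied by its weight, the products are summed, and the sum is divided by the sum of the weights (2 + 2 + 3 + 3 + 4 = 14 here). A minimal sketch, with the function name total_ratio chosen for illustration:

```python
def total_ratio(percentages, weights):
    """Weighted average of per-period percentages (oldest period first)."""
    if len(percentages) != len(weights):
        raise ValueError("one weight is needed per time period")
    weighted_sum = sum(p * w for p, w in zip(percentages, weights))
    return weighted_sum / sum(weights)
```

Calling total_ratio([33, 33, 0, 100, 50], [2, 2, 3, 3, 4]) evaluates 632 / 14, which rounds to the 45% shown in the table.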



Table 4

Mapping of total ratio percentage to final state of LTP

| Total ratio level | Resulting state |
|---|---|
| ratio < 40 | down |
| 40 < ratio < 60 | unknown or intermittent |
| ratio > 60 | up |
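
The mapping in Table 4 can be expressed as a short function. The table does not say which state applies at exactly 40 or 60; placing both boundary values in the middle band is an assumption of this sketch, as is the function name ltp_state.

```python
def ltp_state(ratio):
    """Map a total ratio (0-100) to the final LTP state per Table 4.

    Table 4 does not say which state applies at exactly 40 or 60; placing
    both boundaries in the middle band is an assumption of this sketch.
    """
    if ratio < 40:
        return "down"
    if ratio > 60:
        return "up"
    return "unknown or intermittent"
```

A total ratio of 45, as computed for the window in Table 2, therefore maps to "unknown or intermittent".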
CLAIMS

What is claimed is:

1. A distributed network management system, comprising:
    (a) a hub server; and
    (b) a remote server;
    (c) said remote server capable of communicating with a network device and said hub;
    (d) said hub server capable of communicating with said remote server and said
    network device;
    (e) wherein
        (i) if said hub server and said remote server are operational, said hub
        server communicates with said remote server, and
        (ii) if said hub server is operational and said remote server is inoperational,
        said hub server communicates with said network device.

2. A distributed network management system, comprising:
    (a) a primary hub server;
    (b) a secondary hub server; and
    (c) a remote server;
    (d) said remote server capable of communicating with a network device, said
    primary hub server and said secondary hub server;
    (e) said primary hub server capable of communicating with said remote server and
    said secondary hub server;
    (f) said secondary hub server capable of communicating with said remote server
    and said primary hub server;
    (g) wherein
        (i) if said primary hub server and said remote server are operational, said
        primary hub server communicates with said remote server, and
        (ii) if said primary hub server is inoperational, said secondary hub server is
        operational, and said remote server is operational, said secondary hub server
        communicates with said remote server.
3. A system as recited in claim 2, wherein said primary hub server is capable of
communicating with said network device, and wherein if said primary hub server is
operational and said remote server is inoperational, said primary hub server communicates
with said network device.

4. A system as recited in claim 2, wherein said secondary hub server is capable of
communicating with said network device, and wherein if said primary hub server is
inoperational and said remote server is inoperational, said secondary hub server
communicates with said network device.

5. A distributed network management system, comprising:
    (a) a hub server;
    (b) a primary remote server; and
    (c) a secondary remote server;
    (e) said primary remote server capable of communicating with a remote network,
    said secondary remote server, and said hub server;
    (f) said secondary remote server capable of communicating with said remote
    network, said primary remote server, and said hub server;
    (g) said hub server capable of communicating with said primary remote server and
    said secondary remote server;
    (h) wherein
        (i) if said hub server and said primary remote server are operational, said
        hub server communicates with said primary remote server, and
        (ii) if said hub server is operational, said primary remote server is
        inoperational, and said secondary remote server is operational, said hub server
        communicates with said secondary remote server.

6. A system as recited in claim 5, wherein said hub server is capable of
communicating with said network, and wherein if said hub server is operational and said
primary and said secondary remote servers are inoperational, said hub server communicates
with said network device.
7. A distributed network management system, comprising:
    (a) a primary hub server;
    (b) a secondary hub server;
    (c) a primary remote server; and
    (d) a secondary remote server;
    (e) said primary remote server capable of communicating with a remote network,
    said secondary remote server, said primary hub server and said secondary hub server;
    (f) said secondary remote server capable of communicating with said remote
    network, said primary remote server, said primary hub server and said secondary hub server;
    (g) said primary hub server capable of communicating with said secondary hub
    server, said primary remote server, said secondary remote server, and said remote network;
    (h) said secondary hub server capable of communicating with said primary hub
    server, said primary remote server, said secondary remote server, and said remote network;
    (i) wherein
        (i) if said primary hub server and said primary remote server are
        operational, said primary hub server communicates with said primary remote server,
        (ii) if said primary hub server is operational, said primary remote server is
        inoperational, and said secondary remote server is operational, said primary hub
        server communicates with said secondary remote server,
        (iii) if said primary hub server is operational and said primary and
        secondary remote servers are inoperational, said primary hub server communicates
        with said remote network,
        (iv) if said primary hub server is inoperational, said secondary hub server is
        operational, and said primary remote server is operational, said secondary hub server
        communicates with said primary remote server,
        (v) if said primary hub server is inoperational, said secondary hub server is
        operational, said primary remote server is inoperational, and said secondary remote
        server is operational, said secondary hub server communicates with said secondary
        remote server,
        (vi) if said primary hub server is inoperational, said secondary hub server
        is operational, and said primary and secondary remote servers are inoperational, said
        secondary hub server communicates with said remote network, and
        (vii) if said primary hub server is operational, said secondary hub server is
        operational, and said primary and secondary remote servers are inoperational, said
        primary hub server and said secondary hub server communicate with said remote
        network.
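
For readers who find the seven conditions easier to follow as logic, they can be sketched as a single selection function. This is an illustrative reading only and carries no weight in interpreting the claim; the function name select_links, the string labels, and the empty result when no hub is operational (a case the claim does not address) are assumptions.

```python
def select_links(primary_hub_up, secondary_hub_up, primary_remote_up, secondary_remote_up):
    """Return a list of (hub, target) pairs following conditions (i)-(vii)."""
    if primary_remote_up:
        target = "primary remote"
    elif secondary_remote_up:
        target = "secondary remote"
    else:
        target = "remote network"   # both remotes down: reach the network directly

    if primary_hub_up:
        links = [("primary hub", target)]
        # Condition (vii): with both remotes down, both operational hubs
        # communicate with the remote network.
        if target == "remote network" and secondary_hub_up:
            links.append(("secondary hub", target))
        return links
    if secondary_hub_up:
        return [("secondary hub", target)]
    return []   # no hub operational; not addressed by the conditions
```

For example, select_links(False, True, False, True) yields the secondary hub paired with the secondary remote, which corresponds to condition (v).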



8. A distributed network management system, comprising:
    (a) a hub server;
    (b) a remote server;
    (c) said remote server capable of communicating with a network device and said
    hub;
    (d) said hub server capable of communicating with said remote server and said
    network device; and
    (e) programming associated with at least one of said servers for carrying out the
    operations of
        (i) if said hub server and said remote server are operational, causing said
        hub server to communicate with said remote server, and
        (ii) if said hub server is operational and said remote server is inoperational,
        causing said hub server to communicate with said network device.

9. A distributed network management system, comprising:
    (a) a primary hub server;
    (b) a secondary hub server;
    (c) a remote server;
    (d) said remote server capable of communicating with a network device, said
    primary hub server and said secondary hub server;
    (e) said primary hub server capable of communicating with said remote server and
    said secondary hub server;
    (f) said secondary hub server capable of communicating with said remote server
    and said primary hub server; and
    (g) programming associated with at least one of said servers for carrying out the
    operations of
        (i) if said primary hub server and said remote server are operational,
        causing said primary hub server to communicate with said remote server, and
        (ii) if said primary hub server is inoperational, said secondary hub server is
        operational, and said remote server is operational, causing said secondary hub server
        to communicate with said remote server.

10. A system as recited in claim 9, wherein said primary hub server is capable of
communicating with said network device, and further comprising programming for carrying
out the operation of causing said primary hub server to communicate with said network
device if said primary hub server is operational and said remote server is inoperational.

11. A system as recited in claim 9, wherein said secondary hub server is capable of
communicating with said network device, and further comprising programming for carrying
out the operation of causing said secondary hub server to communicate with said network
device if said primary hub server is inoperational and said remote server is inoperational.

12. A distributed network management system, comprising:
    (a) a hub server;
    (b) a primary remote server;
    (c) a secondary remote server;
    (e) said primary remote server capable of communicating with a remote network,
    said secondary remote server, and said hub server;
    (f) said secondary remote server capable of communicating with said remote
    network, said primary remote server, and said hub server;
    (g) said hub server capable of communicating with said primary remote server and
    said secondary remote server; and
    (h) programming associated with at least one of said servers for carrying out the
    operations of
        (i) if said hub server and said primary remote server are operational,
        causing said hub server to communicate with said primary remote server, and
        (ii) if said hub server is operational, said primary remote server is
        inoperational, and said secondary remote server is operational, causing said hub server
        to communicate with said secondary remote server.

13. A system as recited in claim 12, wherein said hub server is capable of
communicating with said network, and further comprising programming for carrying out the
operation of causing said hub server to communicate with said network device if said hub
server is operational and said primary and said secondary remote servers are inoperational.



14. A distributed network management system, comprising:
    (a) a primary hub server;
    (b) a secondary hub server;
    (c) a primary remote server;
    (d) a secondary remote server;
    (e) said primary remote server capable of communicating with a remote network,
    said secondary remote server, said primary hub server and said secondary hub server;
    (f) said secondary remote server capable of communicating with said remote
    network, said primary remote server, said primary hub server and said secondary hub server;
    (g) said primary hub server capable of communicating with said secondary hub
    server, said primary remote server, said secondary remote server, and said remote network;
    (h) said secondary hub server capable of communicating with said primary hub
    server, said primary remote server, said secondary remote server, and said remote network;
    and
    (i) programming associated with at least one of said servers for carrying out the
    operations of
        (i) if said primary hub server and said primary remote server are
        operational, causing said primary hub server to communicate with said primary
        remote server,
        (ii) if said primary hub server is operational, said primary remote server is
        inoperational, and said secondary remote server is operational, causing said primary
        hub server to communicate with said secondary remote server,
        (iii) if said primary hub server is operational and said primary and
        secondary remote servers are inoperational, causing said primary hub server to
        communicate with said remote network,
        (iv) if said primary hub server is inoperational, said secondary hub server is
        operational, and said primary remote server is operational, causing said secondary hub
        server to communicate with said primary remote server,
        (v) if said primary hub server is inoperational, said secondary hub server is
        operational, said primary remote server is inoperational, and said secondary remote
        server is operational, causing said secondary hub server to communicate with said
        secondary remote server,
        (vi) if said primary hub server is inoperational, said secondary hub server
        is operational, and said primary and secondary remote servers are inoperational,
        causing said secondary hub server to communicate with said remote network, and
        (vii) if said primary hub server is operational, said secondary hub server is
        operational, and said primary and secondary remote servers are inoperational, causing
        said primary hub server and said secondary hub server to communicate with said
        remote network.



15. A distributed network management system, comprising:
    (a) a hub server; and
    (b) a remote server;
    (c) said remote server capable of communicating with a network device and said
    hub server;
    (d) wherein configuration parameters for said remote server to communicate with
    said network device can be propagated between said hub server and said remote server
    bidirectionally.



16. A distributed network management system, comprising:
    (a) a network server capable of communicating with a network device; and
    (b) means associated with said network server for deriving state information from
    said network device using LTP.

17. A system as recited in claim 16, wherein said LTP comprises:
    (a) defining a polling interval;
    (b) sending, from an ICMP server, a plurality of pings to an interface address on
    said network device during said polling interval;
    (c) monitoring the number of pings returned from said network device and
    converting said number to a percentage based on the number of pings sent;
    (d) sending an SNMP query to said network device and determining operational
    status of said network device from said SNMP query, said operational status comprising "up",
    "down", and "unknown";
    (e) using the percentage of pings returned and the SNMP status, generating a
    status percentage for the polling period by multiplying the percentage pings returned by a
    constant value associated with said operational status, said constant value comprising a first
    value if the operational status is "up", a second value if the operational status is "down", and a
    third value if the operational status is "unknown"; and
    (f) computing a weighted average of the status percentages for current and
    previous four polling periods and determining the state of the network device from the
    weighted average.

18. A system as recited in claim 16, further comprising:
    (a) means for defining a polling interval;
    (b) means for sending, from an ICMP server, a plurality of pings to an interface
    address on said network device during said polling interval;
    (c) means for monitoring the number of pings returned from said network device
    and converting said number to a percentage based on the number of pings sent;
    (d) means for sending an SNMP query to said network device and determining
    operational status of said network device from said SNMP query, said operational status
    comprising "up", "down", and "unknown";
    (e) means for using the percentage of pings returned and the SNMP status,
    generating a status percentage for the polling period by multiplying the percentage pings
    returned by a constant value associated with said operational status, said constant value
    comprising a first value if the operational status is "up", a second value if the operational
    status is "down", and a third value if the operational status is "unknown"; and
    (f) means for computing a weighted average of the status percentages for current
    and previous four polling periods and determining the state of the network device from the
    weighted average.



19. A system as recited in claim 16, further comprising programming associated
with said network server for carrying out the functions of:
    (a) defining a polling interval;
    (b) sending, from an ICMP server, a plurality of pings to an interface address on
    said network device during said polling interval;
    (c) monitoring the number of pings returned from said network device and
    converting said number to a percentage based on the number of pings sent;
    (d) sending an SNMP query to said network device and determining operational
    status of said network device from said SNMP query, said operational status comprising "up",
    "down", and "unknown";
    (e) using the percentage of pings returned and the SNMP status, generating a
    status percentage for the polling period by multiplying the percentage pings returned by a
    constant value associated with said operational status, said constant value comprising a first
    value if the operational status is "up", a second value if the operational status is "down", and a
    third value if the operational status is "unknown"; and
    (f) computing a weighted average of the status percentages for current and
    previous four polling periods and determining the state of the network device from the
    weighted average.

20. A system for deriving state information from a network device, comprising:
    (a) a computer; and
    (b) programming associated with said computer for carrying out the operations of
        (i) defining a polling interval;
        (ii) sending, from an ICMP server, a plurality of pings to an interface
        address on said network device during said polling interval;
        (iii) monitoring the number of pings returned from said network device and
        converting said number to a percentage based on the number of pings sent;
        (iv) sending an SNMP query to said network device and determining
        operational status of said network device from said SNMP query, said operational
        status comprising "up", "down", and "unknown";
        (v) using the percentage of pings returned and the SNMP status,
        generating a status percentage for the polling period by multiplying the percentage
        pings returned by a constant value associated with said operational status, said
        constant value comprising a first value if the operational status is "up", a second value
        if the operational status is "down", and a third value if the operational status is
        "unknown"; and
        (vi) computing a weighted average of the status percentages for current and
        previous four polling periods and determining the state of the network device from the
        weighted average.

21. A method for distributed network management, comprising:
    (a) providing a hub server;
    (b) providing a remote server;
    (c) said remote server capable of communicating with a network device and said
    hub;
    (d) said hub server capable of communicating with said remote server and said
    network device;
    (e) if said hub server and said remote server are operational, causing said hub
    server to communicate with said remote server; and
    (f) if said hub server is operational and said remote server is inoperational,
    causing said hub server to communicate with said network device.

22. A method for distributed network management, comprising:
    (a) providing a primary hub server;
    (b) providing a secondary hub server;
    (c) providing a remote server;
    (d) said remote server capable of communicating with a network device, said
    primary hub server and said secondary hub server;
    (e) said primary hub server capable of communicating with said remote server and
    said secondary hub server;
    (f) said secondary hub server capable of communicating with said remote server
    and said primary hub server;
    (g) if said primary hub server and said remote server are operational, causing said
    primary hub server to communicate with said remote server, and
    (h) if said primary hub server is inoperational, said secondary hub server is
    operational, and said remote server is operational, causing said secondary hub server to
    communicate with said remote server.



23. A system as recited in claim 22, wherein said primary hub server is capable of
communicating with said network device, and further comprising causing said primary hub
server to communicate with said network device if said primary hub server is operational and
said remote server is inoperational.

24. A system as recited in claim 22, wherein said secondary hub server is capable
of communicating with said network device, and further comprising causing said secondary
hub server to communicate with said network device if said primary hub server is
inoperational and said remote server is inoperational.

25. A method for distributed network management, comprising:
    (a) providing a hub server;
    (b) providing a primary remote server;
    (c) providing a secondary remote server;
    (e) said primary remote server capable of communicating with a remote network,
    said secondary remote server, and said hub server;
    (f) said secondary remote server capable of communicating with said remote
    network, said primary remote server, and said hub server;
    (g) said hub server capable of communicating with said primary remote server and
    said secondary remote server;
    (h) if said hub server and said primary remote server are operational, causing said
    hub server to communicate with said primary remote server; and
    (i) if said hub server is operational, said primary remote server is inoperational,
    and said secondary remote server is operational, causing said hub server to communicate with
    said secondary remote server.



26. A method as recited in claim 25, wherein said hub server is capable of
communicating with said network, and further comprising causing said hub server to
communicate with said network device if said hub server is operational and said primary and
said secondary remote servers are inoperational.

27. A method for distributed network management, comprising: 

(a) providing a primary hub server; 

(b) providing a secondary hub server, 
1 5 (c) providing a primary remote server; 

(d) providing a secondary remote server; 

(e) said primary remote server capable of communicating with a lemote network, 
said secondary remote server, said primary hub server and said secondary hub server; 

(f) said secondary remote server capable of communicating with said remote 

20 network, said primary remote server, said primary hub server and said secondary hub server; 

(g) said primary hub server capable of cormnimicating with said secondary hub 
server, said primary remote server, said secondary remote server, and said remote network; 

Qx) said secondary hub server enable of communicating with said primary hub 

server, said primary remote server, said secondary remote swver, and said remote network; 
25 (i) if said primary hub server and said primary remote server are operational, 

causmg said primary hub server to communicate with said primary remote server; 

(j) if said primary hub server is operational, said primary i^mote server is 

inoperational, and said secondary remote server is operational, caxising said primary hub 

server communicates with said secondary remote server; 

(k) if said primary hub server is operational and said primary and secondary
remote servers are inoperational, causing said primary hub server to communicate with said
remote network;

(l) if said primary hub server is inoperational, said secondary hub server is
operational, and said primary remote server is operational, causing said secondary hub server
to communicate with said primary remote server;

(m) if said primary hub server is inoperational, said secondary hub server is
operational, said primary remote server is inoperational, and said secondary remote server is
operational, causing said secondary hub server to communicate with said secondary remote
server;

(n) if said primary hub server is inoperational, said secondary hub server is
operational, and said primary and secondary remote servers are inoperational, causing said
secondary hub server to communicate with said remote network; and

(o) if said primary hub server is operational, said secondary hub server is 
operational, and said primary and secondary remote servers are inoperational, causing said 
primary hub server and said secondary hub server to communicate with said remote network. 
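
By way of illustration only, the selection logic recited in steps (i) through (o) of claim 27 can be sketched as a single decision function. The function name, the string labels, and the representation of operational status as booleans are assumptions made for this sketch and do not appear in the claims; how operational status is detected is outside its scope.

```python
from typing import List, Tuple


def select_paths(primary_hub_up: bool, secondary_hub_up: bool,
                 primary_remote_up: bool,
                 secondary_remote_up: bool) -> List[Tuple[str, str]]:
    """Return (hub, target) pairs that should communicate, per steps (i)-(o)."""
    # Choose the target: primary remote first, then secondary remote,
    # and the remote network itself only when both remotes are down.
    if primary_remote_up:
        target = "primary_remote"
    elif secondary_remote_up:
        target = "secondary_remote"
    else:
        target = "remote_network"

    paths: List[Tuple[str, str]] = []
    if primary_hub_up:
        # Steps (i), (j), (k): the primary hub takes the selected target.
        paths.append(("primary_hub", target))
        # Step (o): with both remotes down, an operational secondary hub
        # also communicates with the remote network.
        if secondary_hub_up and target == "remote_network":
            paths.append(("secondary_hub", target))
    elif secondary_hub_up:
        # Steps (l), (m), (n): the secondary hub takes over for a failed primary hub.
        paths.append(("secondary_hub", target))
    return paths
```

For example, calling `select_paths(False, True, False, True)` corresponds to step (m): the secondary hub server communicates with the secondary remote server.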

28. A method for distributed network management, comprising:

(a) providing a hub server; 

(b) providing a remote server; 

(c) said remote server capable of communicating with a network device and said
hub server; and

(d) propagating configuration parameters for said remote server to communicate
with said network device between said hub server and said remote server bidirectionally.
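
By way of illustration only, bidirectional propagation as recited in step (d) of claim 28 might be sketched as follows. The claim does not specify how conflicting changes are reconciled; the per-parameter version counter and the "higher version wins" rule below are assumptions made for this sketch, as are the names `ConfigStore`, `propagate`, and `sync_bidirectional`.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class ConfigStore:
    """Configuration held by one server: parameter name -> (version, value)."""
    name: str
    params: Dict[str, Tuple[int, str]] = field(default_factory=dict)

    def set(self, key: str, value: str) -> None:
        # Assumed scheme: every local change bumps the parameter's version.
        version = self.params.get(key, (0, ""))[0] + 1
        self.params[key] = (version, value)


def propagate(source: ConfigStore, target: ConfigStore) -> int:
    """Copy entries that are missing or older in target; return how many were copied."""
    copied = 0
    for key, (version, value) in source.params.items():
        if key not in target.params or target.params[key][0] < version:
            target.params[key] = (version, value)
            copied += 1
    return copied


def sync_bidirectional(hub: ConfigStore, remote: ConfigStore) -> None:
    """Propagate changes hub -> remote and remote -> hub."""
    # Equal versions with different values are left untouched in this sketch.
    propagate(hub, remote)
    propagate(remote, hub)
```

In this sketch a parameter changed at the hub reaches the remote, and a parameter changed at the remote reaches the hub, after a single call to `sync_bidirectional`.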

29. A method for distributed network management, comprising:

(a) providing a network server capable of communicating with a network device;
and

(b) deriving state information from said network device using LTP.



30. A method as recited in claim 29, wherein said LTP comprises:

(a) defining a polling interval;

(b) sending, from an ICMP server, a plurality of pings to an interface address on 
said network device during said polling interval; 

(c) monitoring the number of pings returned from said network device and
converting said number to a percentage based on the number of pings sent;

(d) sending an SNMP query to said network device and determining operational
status of said network device from said SNMP query, said operational status comprising "up",
"down", and "unknown";

(e) using the percentage of pings returned and the SNMP status, generating a
status percentage for the polling period by multiplying the percentage of pings returned by a
constant value associated with said operational status, said constant value comprising a first
value if the operational status is "up", a second value if the operational status is "down", and a
third value if the operational status is "unknown"; and

(f) computing a weighted average of the status percentages for current and 
previous four polling periods and determining the state of the network device from the 
weighted average. 
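
By way of illustration only, steps (c), (e), and (f) of claim 30 can be sketched numerically as follows. The claim recites a first, second, and third constant value, a weighted average, and a resulting state, but does not give numbers; the values in `STATUS_FACTOR` and `WEIGHTS`, the thresholds in `classify`, and the state label "degraded" are hypothetical and chosen only to make the sketch runnable. Sending the pings and the SNMP query is omitted.

```python
from typing import Sequence

# Hypothetical constants: one multiplier per SNMP operational status.
STATUS_FACTOR = {"up": 1.0, "down": 0.0, "unknown": 0.5}
# Hypothetical weights, newest period first (current plus four previous periods).
WEIGHTS = (5, 4, 3, 2, 1)


def status_percentage(pings_sent: int, pings_returned: int, snmp_status: str) -> float:
    """Steps (c) and (e): ping percentage scaled by the status constant."""
    if pings_sent <= 0:
        raise ValueError("pings_sent must be positive")
    ping_pct = 100.0 * pings_returned / pings_sent
    return ping_pct * STATUS_FACTOR[snmp_status]


def weighted_average(history: Sequence[float]) -> float:
    """Step (f): weighted average of up to five status percentages, newest first."""
    if not history:
        raise ValueError("history must not be empty")
    recent = history[:len(WEIGHTS)]
    used = WEIGHTS[:len(recent)]
    return sum(w * s for w, s in zip(used, recent)) / sum(used)


def classify(average: float, up_threshold: float = 60.0,
             down_threshold: float = 20.0) -> str:
    """Map the weighted average to a device state (thresholds are hypothetical)."""
    if average >= up_threshold:
        return "up"
    if average <= down_threshold:
        return "down"
    return "degraded"
```

Under these assumed values, a period in which 8 of 10 pings return and SNMP reports "up" yields a status percentage of 80, and averaging over five periods keeps a single poor period from changing the derived state on its own.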

31. A method for deriving state information from a network device, comprising:

(a) defining a polling interval; 

(b) sending, from an ICMP server, a plurality of pings to an interface address on 
said network device during said polling interval; 

(c) monitoring the number of pings returned from said network device and
converting said number to a percentage based on the number of pings sent;

(d) sending an SNMP query to said network device and determining operational 
status of said network device from said SNMP query, said operational status comprising "up", 
"down", and "unknown"; 

(e) using the percentage of pings returned and the SNMP status, generating a
status percentage for the polling period by multiplying the percentage of pings returned by a
constant value associated with said operational status, said constant value comprising a first
value if the operational status is "up", a second value if the operational status is "down", and a
third value if the operational status is "unknown"; and

(f) computing a weighted average of the status percentages for current and
previous four polling periods and determining the state of the network device from the
weighted average.